If you want to find the position of a match in an incoming string, simply check the length of $` (That's \(PREMATCH if you've chosen to use English;) to check where it starts, and add the length of \)& (that's $MATCH) to find where it ends.
Let's say I want to find all the URLs refered to in a web page that's loaded into the variable $html. I could write:
push @section,[length($`),length($&),$1] while ($html =~ m/(https?://[^ >"]+)/g);
and that will give me a list of 3-element lists containing start point, length and actual string matched. Here's the code to display that list:
foreach $element(@section) { print (join(", ",@$element),"\n"); }
and here's some of the results from the sources of our resources index
5979, 36, http://www.wellho.net/forum/top.html 6967, 36, http://www.wellho.net/net/mouth.html 7059, 42, http://www.wellho.net/downloads/index.html 8369, 67, http://www.wellho.net/mouth/387_Training-course-plans-for-2006.html 9365, 43, http://www.trainingcenter.co.uk/travel.html 9516, 45, http://reiseauskunft.bahn.de/bin/query.exe/en 9599, 59, http://www.livedepartureboards.co.uk/ldb/summary.aspx?T=MKM 9861, 48, https://lightning.he.net/~wellho/net/secure.html
P.S. I loaded my whole web page into a single variable using the code
open (FH,"/Library/WebServer/live_html/resources/index.html"); undef $/; $html = <FH>;
So, the whole script could be:
open FH,$ARGV[0] or die $!; undef $/; $html = <FH>; push @section,[length($`),length($&),$1] while ($html =~ m!(https?://[^ >"]+)!g); foreach $element(@section) { print (join(", ",@$element),"\n"); }
Hide Comments